Sains Malaysiana 53(11)(2024): 3607-3615

http://doi.org/10.17576/jsm-2024-5311-05

 

Al-Khawarizmi Heuristik bagi Pautan Data dalam Menganggarkan Bilangan Kemalangan Jalan Raya Tidak Terlapor

(A Heuristic Algorithm of Data Linkage in Estimating the Number of Unreported Traffic Accidents)

 

ZAMIRA HASANAH ZAMZURI* & NOR WAZIRAH RADZMAN SHAH

 

Department of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia

 

Received: 30 April 2024/Accepted: 7 August 2024

 

Abstrak

Analisis data kemalangan jalan raya adalah sangat penting bagi merancang strategi pencegahan yang optimum serta meminimumkan risiko berlakunya kemalangan. Bilangan kemalangan jalan raya yang dilaporkan sering kali menunjukkan kekerapan sifar yang tinggi, yang dipercayai berasal daripada situasi kemalangan yang tidak dilaporkan. Maka, penganggaran kemalangan tidak dilaporkan adalah amat penting bagi mengelakkan risiko terkurang anggaran dan ketidaktepatan dalam analisis kemalangan jalan raya. Salah satu cara untuk menganggarkan kemalangan tidak dilaporkan ini adalah menerusi perbandingan dua set data dan kadar entri data yang tidak dapat dipadankan menjadi kadar kemalangan tidak dilaporkan. Kajian ini menggunakan teknik pautan data berkebarangkalian bagi memautkan dua set data kemalangan jalan raya yang berasal daripada laporan polis dan rekod hospital dari Januari sehingga Mac 2011. Satu al-Khawarizmi heuristik dibangunkan berdasarkan keperluan semasa proses pautan data dijalankan. Unsur heuristik ini menitikberatkan proses pautan data secara berperingkat bagi mengenal pasti set pengecam yang tidak unik yang digunakan serta tapisan data yang bersesuaian dan rasional bagi anggaran yang ingin dicapai. Seterusnya penukaran unit bagi setiap entri data dari per individu ke per kemalangan juga diperlukan kerana matlamat akhir adalah untuk memperoleh jumlah kemalangan yang tidak dilaporkan. Pautan data yang dijalankan dalam kajian ini menggunakan pengecam bukan unik seperti jantina, umur, bangsa dan jenis kenderaan. Berdasarkan data yang dipautkan dan proses penganggaran yang dilaksanakan, dianggarkan sekitar 68% kemalangan adalah tidak dilaporkan dengan bilangan sebanyak 6366. Al-Khawarizmi heuristik yang dibangunkan ini dapat digunakan untuk pautan data kemalangan jalan raya antara laporan polis dan rekod hospital di Malaysia. 
Perbandingan antara kemalangan yang dilaporkan dan tidak dilaporkan dalam data hospital turut mendedahkan bahawa kebanyakan kemalangan yang tidak dilaporkan melibatkan kesalahan jenayah serius seperti penggunaan dadah dan alkohol berlebihan.

 

Kata kunci: Heuristik; kemalangan jalan raya; pautan data; tidak dilaporkan

 

Abstract

Traffic accident data analysis is vital for planning optimal preventive measures and minimizing the risk of accidents. Traffic accident count data often exhibit excess zeros, believed to stem from underreporting. Hence, estimating the number of unreported accidents is needed to avoid underestimation and inaccuracy in traffic accident data analysis. One way to estimate unreported accidents is to compare two data sets and take the proportion of unmatched entries as the underreporting rate. In this study, a probabilistic data linkage technique is used to link two traffic accident data sets, sourced from police reports and hospital records, covering January to March 2011. A heuristic algorithm is developed based on the needs that arose during the linkage process. The heuristic elements lie in the staged data linkage process, which establishes the best set of non-unique identifiers, and in identifying suitable and rational data filters for the intended estimate. The unit of each data entry then needs to be converted from per individual to per accident, since the ultimate aim of this study is to estimate the number of unreported accidents. The non-unique identifiers used in the data linkage are gender, age, race and vehicle type. Based on the linked data and the estimation process, around 68% of accidents are estimated to be unreported, amounting to 6366 accidents. The developed algorithm can be used to link traffic accident data from police reports and hospital records in Malaysia. A comparison of reported and unreported accidents in the hospital records also shows that most unreported accidents involve serious offences such as excessive drug and alcohol use.
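The estimation idea described above (link hospital records to police records on non-unique identifiers, convert from per individual to per accident, and take the unmatched share as the underreporting rate) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes Fellegi-Sunter style agreement weights as one common form of probabilistic linkage, and the m/u probabilities, the threshold, the record layout, the `accident` key and the toy records are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical (m, u) probabilities per identifier, for illustration only.
# m = P(field agrees | same person), u = P(field agrees | different persons).
M_U = {
    "gender": (0.95, 0.50),
    "age": (0.90, 0.05),
    "race": (0.95, 0.30),
    "vehicle": (0.90, 0.20),
}

def match_weight(hospital_rec, police_rec):
    """Sum of log2 agreement/disagreement weights over all identifiers."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if hospital_rec[field] == police_rec[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

def link(hospital, police, threshold):
    """Greedy one-to-one linkage: each police record is used at most once."""
    used, links = set(), {}
    for h in hospital:
        best = None
        for j, p in enumerate(police):
            if j in used:
                continue
            w = match_weight(h, p)
            if w >= threshold and (best is None or w > best[1]):
                best = (j, w)
        if best is not None:
            used.add(best[0])
            links[h["id"]] = police[best[0]]["id"]
    return links

def unreported_rate(hospital, links):
    """Collapse casualties to accidents (assumed 'accident' key).

    An accident counts as reported if any of its casualties was linked.
    """
    accidents = {}
    for h in hospital:
        key = h["accident"]
        accidents[key] = accidents.get(key, False) or (h["id"] in links)
    unreported = sum(1 for reported in accidents.values() if not reported)
    return unreported, unreported / len(accidents)

# Toy records (invented): four hospital casualties from three accidents.
hospital = [
    {"id": "H1", "accident": "A1", "gender": "M", "age": 25, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "H2", "accident": "A1", "gender": "F", "age": 23, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "H3", "accident": "A2", "gender": "M", "age": 40, "race": "Chinese", "vehicle": "car"},
    {"id": "H4", "accident": "A3", "gender": "F", "age": 31, "race": "Indian", "vehicle": "car"},
]
police = [
    {"id": "P1", "gender": "M", "age": 25, "race": "Malay", "vehicle": "motorcycle"},
    {"id": "P2", "gender": "M", "age": 52, "race": "Malay", "vehicle": "lorry"},
]

links = link(hospital, police, threshold=6.0)   # only H1 links, to P1
count, rate = unreported_rate(hospital, links)  # 2 of 3 accidents unmatched
```

In this toy case only one casualty links, yet accident A1 still counts as reported because the conversion to per-accident units needs just one linked casualty per accident. That is one plausible reading of the unit conversion, stated here as an assumption rather than as the rule used in the study.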

 

Keywords: Data linkage; heuristic; traffic accidents; unreported

 

 

*Corresponding author; email: zamira@ukm.edu.my